-
-
Notifications
You must be signed in to change notification settings - Fork 38
Integrate zimfarm dev setup #1027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
62d06b6 to
de7afb9
Compare
8176974 to
fc088ff
Compare
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (80.64%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #1027 +/- ##
==========================================
- Coverage 92.90% 92.75% -0.16%
==========================================
Files 73 73
Lines 4229 4249 +20
==========================================
+ Hits 3929 3941 +12
- Misses 300 308 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
audiodude
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this approach.
audiodude
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also update the name of this PR to something like "Integrate Zimfarm dev setup"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments.
I also feel like nobody did ran this from end-to-end since the zimfarm worker resources were not adequate, or do I miss something?
We should really run this from end-to-end to ensure this setup works correctly.
And by end-to-end, I mean / propose following scenario:
- user log in WP1 UI
- user creates a simple selection (one or two WPEN article for instance, this detail is not important)
- user requests this selection to be ZIMed
- ZIM file is correctly created
- WP1 UI displays ZIM location
- user can download the ZIM
I totally understand that @elfkuzco might need help from @audiodude and myself regarding details on how to run this end-to-end test, but we should not close this issue / PR until we are sure that everything works from end-to-end. Otherwise this is mostly just a waste.
Yes I can definitely help with that. I'll patch this PR and try setting up/running the zimfarm locally and confirm that I can create and download ZIMs. |
|
Updated the files with the recent changes:
|
|
Code LGTM, waiting for e2e test from @audiodude (if I get it correctly) to give my formal approval |
|
I made some minor tweaks to the PR, but it's still not working. My Zimfarm is still reporting the following for requests to EDIT: This is after following the directions in the README and updating my local |
|
Hum, this is indeed a problem. To unblock you, please set It is however not the proper way to solve this situation to merge this PR. We will continuously have new offliner definitions arriving, and all of them should be stored in the local Zimfarm DB so that dev can use mostly any mwoffliner version / definition version. I feel like the |
|
@elfkuzco Looking at this again now. I tried to update some of the credentials*example files per @benoit74's suggestion. I'm still getting failures in my Zimfarm: I have this in my credentials.py.dev: I also tried it with the following: And still got: as a reminder, this is the stanza in docker-compose-dev.yml: |
|
Can I see your logs for the task worker and worker manager? My suspicion is that this is probably an older image of the task worker which still uses the dnscache. After openzim/zimfarm#1515, |
|
Did a |
|
Actually, I don't think it's a problem with Zimfarm, because when I paste http://wp1bot-web-dev:5000/v1/builders/04c0f118-f9e5-449a-a8de-f05026555482/selection/latest.tsv and change the server to |
|
Oh, don't use https with the minio URL. |
|
Okay I've updated my credentials and things seem to be working now, except for openzim/mwoffliner#2578 so I actually haven't seen a full e2e manual test yet (the Zimfarm has been running for 30+ minutes for my 1 article). |
Yes I fixed that and some other problems with my credentials. This credentials.py thing is such a nightmare, I can't wait to be done with it. |
af21eae to
544ecb6
Compare
Could you pin to a specific version: See https://api.farm.openzim.org/v2/offliners/mwoffliner/versions? I can't remember which version I pinned to but you could pin to 1.17.2. Also, I used a simple selection |
|
Actually I think there was a bug in my code where I was passing a |
I have tested this end to end in the development environment, and was successfully able to create a valid ZIM.
34c924e to
96a75b8
Compare
|
@audiodude do you intend to deploy this soon? It contains changes for production which would enable us to merge openzim/zimfarm#1422 and finally cleanup zimfarm codebase from transitory things around offliner definition versions. Do not forget you need to add the new |
|
@benoit74 This has been deployed with the updated config. I have https://farm.zimit.kiwix.org/recipes/wp1_selection_883c769321f6dd39 which was created from WP1 |
|
I was able to successfully download the ZIM created from WP1 |
|
Thank you! |
This PR represents a fundamental change in the operation of the WP1 development server. The entire Flask app (referred to generally as "web", the server running at localhost:5000) is pushed into the docker compose graph. It is joined there by instances of all the images and services needed by the Zimfarm. There are several associated configuration and logic changes as well.
Taken together, this means that a WP1 developer will be able to run the selection -> ZIM process end to end, without requiring any credentials on a live Zimfarm.
Additional Rationale
As part of cleanup in zimfarm API (openzim/zimfarm#1391), requests to create recipes/tasks now require an offliner definition version. This PR sets the version of the offliner definition from env variable and sets up zimfarm containers in a docker-compose graph. Previously, the API used "initial" as the definition versions but as scrapers evolve and arguments change, the definitions change too.
Changes